Job Provenance - Insight into Very Large Provenance Datasets
Abstract
Following the job-centric monitoring concept, the Job Provenance (JP) service organizes provenance records on a per-job basis. It is designed to manage a very large number of records, as was required in the EGEE project, where it was originally developed. This quantitative aspect is also the focus of the presented demonstration. We show JP's capability to retrieve data items of interest from a large dataset containing the full records of more than 1 million jobs, to perform non-trivial transformations on those data, and to organize the results in such a way that repeated interactive queries are possible. The application area of the demo is derived from that of previous Provenance Challenges. Although the topic of the demo, a computational experiment, is arranged rather artificially, the demonstration still delivers its main message: JP supports non-trivial transformations and interactive queries on large data sets.
Related resources
Provenir ontology: Towards a Framework for eScience Provenance Management
Satya S. Sahoo, Amit P. Sheth. Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH-45324, USA. {sahoo.2, amit.sheth}@wright.edu. Abstract: Provenance metadata describes the “lineage” or history of an entity and necessary information to verify the quality of data, validate experiment protocols, and associate trust value with scientific result...
Provenance Capture Disparities Highlighted through Datasets
Provenance information is inherently affected by the method of its capture. Different capture mechanisms create very different provenance graphs. In this work, we describe an academic use case that has corollaries in offices everywhere. We also describe two distinct possibilities for provenance capture methods within this domain. We generate three datasets using these two capture methods: the c...
Exploring Provenance in a Distributed Job Execution System
We examine provenance in the context of a distributed job execution system. It is crucial to capture provenance information during the execution of a job in a distributed environment because often this information is lost once the job has finished. In this paper we discuss the type of information that is available within a distributed job execution system, how to capture such information, and w...
Characterizing users' visual analytic activity for insight provenance
Insight provenance—a historical record of the process and rationale by which an insight is derived—is an essential requirement in many visual analytics applications. While work in this area has relied on either manually recorded provenance (e.g., user notes) or automatically recorded event-based insight provenance (e.g., clicks, drags, and key-presses), both approaches have fundamental limitati...
Provenance Algebra and Materialized View-based Provenance Management
Provenance, from the French word "provenir" meaning "to come from", describes the lineage of an entity. Provenance is critical information in eScience to accurately interpret scientific results. Though information provenance has been recognized as a hard problem in computing science (British Computing Society, 2004), many fundamental research issues in provenance have yet to be addressed. A com...